Batch Download
| Batch Download Help | |
|---|---|
Last Updated: 30 September 2008

The Batch Download tool provides access to a variety of data and data formats for a specified list of IDs. The specified list of IDs can be large (e.g. all genes) or small (e.g. one gene), but each ID should be provided on a separate line. The types of data available through Batch Download include FASTA sequences, XML files, and data from specified fields on reports. Results can be downloaded or viewed online.
Getting Started
The Batch Download tool is organized into three main parts: Selection of Output Format (the type of data you are looking for), Output Options (each format has multiple options), and the input area at the bottom of the page. To get started, select the type of data you'd like to retrieve. The types of data that can be accessed through Batch Download include FASTA sequence, Database formatted data (XML), and Field Data.

After selecting an output format, select additional output options from the pull-down menus. For example, if FASTA sequences are selected, you'll need to specify which type of sequence you'd like to retrieve (e.g. Gene region, translations, etc.). Output can be sent either to your browser or to a file; choose the option you prefer using the drop-down menu under "Send results to:".

Finally, enter your list of IDs. You can enter a list manually, upload a list from your computer, or arrive at Batch Download via "Export hits to Batch Download" from a hits list. If you've opted to retrieve FASTA sequence or Database formatted data (XML), simply click "Get sequence" or "Get records" to retrieve your data. If you've selected Field Data, you'll need to complete one more step: selecting the exact fields of interest.

For more detailed help, see the following sections, which give step-by-step instructions for each output format.
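Because each ID has to sit on its own line, it can help to normalize a list before pasting or uploading it. The following is a minimal Python sketch of that preparation step, not part of the Batch Download tool itself. The function name `format_id_list` and the file name `batch_ids.txt` are hypothetical, and dropping duplicates is a convenience assumed here rather than a stated requirement.

```python
def format_id_list(ids):
    """Return the IDs as text with one ID per line.

    Surrounding whitespace is stripped, blank entries are dropped, and
    duplicates are removed while keeping first-seen order (an assumption
    made for convenience; the help page does not require it).
    """
    seen = set()
    cleaned = []
    for raw in ids:
        item = raw.strip()
        if item and item not in seen:
            seen.add(item)
            cleaned.append(item)
    return "".join(item + "\n" for item in cleaned)


if __name__ == "__main__":
    # Transposon symbols used as examples elsewhere on this page.
    text = format_id_list(["P-element", " hobo ", "", "copia", "hobo"])
    print(text, end="")
    # To upload rather than paste, the text could be saved first, e.g.:
    # from pathlib import Path
    # Path("batch_ids.txt").write_text(text, encoding="utf-8")
```

The resulting text can be pasted into the input area or saved to a file for upload.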
How to Download FASTA Sequence
There are several options for downloading FASTA sequences, including Gene region, translations, "Transposons", and "By sequence coordinates".

In all cases except "Transposons" and "By sequence coordinates", you should enter gene symbols. For "Transposons", enter transposon symbols (e.g. P-element, hobo, copia). For "By sequence coordinates", enter GBrowse-style coordinates (e.g. 2R:8717636..8727699); multiple sets of coordinates can be entered.

Remember to specify whether you want your results returned in a browser window or written to a text file that you can save on your computer.
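When preparing many coordinate entries, checking their shape before submission can catch typos early. The sketch below parses a line of the form shown above (`2R:8717636..8727699`). The pattern is inferred only from that single example, so the tool may well accept variants that this check rejects; `Region` and `parse_region` are hypothetical names.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    arm: str
    start: int
    end: int


# Assumed pattern, inferred from the example "2R:8717636..8727699":
# a sequence name, a colon, then two integers separated by "..".
_COORD_RE = re.compile(r"^(?P<arm>[^:\s]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_region(line):
    """Parse one coordinate line into a Region, or raise ValueError."""
    match = _COORD_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not in name:start..end form: {line!r}")
    return Region(match["arm"], int(match["start"]), int(match["end"]))


if __name__ == "__main__":
    region = parse_region("2R:8717636..8727699")
    print(region.arm, region.start, region.end)
```

Running `parse_region` over every line of a list before submitting it flags malformed entries without contacting the server.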
How to Download Database XML
To retrieve XML, first select either Chado XML or Reporting XML. Then enter the IDs for the features of interest. You can enter any type of feature for which there are reports on FlyBase (e.g. genes, alleles, insertions, natural transposons, etc.).

Remember to specify whether you want your results returned in a browser window or written to a text file that you can save on your computer.
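Once XML has been saved to a file, a quick way to get oriented is to count which element names it contains. The sketch below does this with Python's standard `xml.etree.ElementTree`. It makes no assumptions about the Chado XML or Reporting XML schemas, and the toy document in the demo is invented for illustration and is not FlyBase output.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_tags(xml_text):
    """Return a Counter mapping each element tag name to its frequency."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


if __name__ == "__main__":
    # Invented toy document; real downloads will use different element names.
    toy = "<records><record><name>a</name></record><record><name>b</name></record></records>"
    for tag, n in count_tags(toy).most_common():
        print(tag, n)
```

For a saved download, the file contents can be read and passed to `count_tags`; very large files may be better handled with `ET.iterparse`.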
How to Download Field Data
"Field Data" includes all the data that is presented in any of our reports. For example, for a list of genes, you can retrieve transcript expression data, GO: Molecular Function terms, lists of cDNA clones, etc. For a list of alleles, you can retrieve phenotypic class information, lists of stocks, etc.

There are three output options: As HTML Table, As Tab Separated File, and From Precomputed Files. Use the first option if you plan to view your results in the browser window or print them directly, since the output will be nicely formatted. If you'd like to save your output in a file or open it in a spreadsheet, choose the tab-separated option. Finally, From Precomputed Files lets you obtain the data available in our precomputed files for your list of IDs. For example, if you have a list of genes, you can get the overlapping affy oligos from the "Genes: fbgn_exons2affy1_overlaps.tsv" file. Many other options are available from precomputed files, so it's worth taking a look at the list.

It is important to note that Batch Download can only retrieve information about a single dataset at a time. For example, you cannot simultaneously retrieve field data for genes and alleles. If you input a list that contains a mix of gene and allele symbols, whichever data type has more entries will be recognized and the other data type will be ignored. If your list contains equal numbers of each, the data type of the first entry on the list will be recognized.

Remember to specify whether you want your results returned in a browser window or written to a text file that you can save on your computer.

Once you have entered your list of IDs, click "Select Fields". This opens a new window with a menu of all the available fields for your dataset of interest. Be aware that some data types have a very long list of fields (genes, alleles, insertions, etc.) and other data types have shorter lists (clones, transcripts, etc.). This is also true if you are getting field data from precomputed files.

The fields are organized first by the type of data they contain: Controlled Vocabulary (CV), Symbol, or Free Text (Text). Within each category, fields are listed alphabetically by their field label (the text found in the tan boxes on each report). Multiple fields can be selected at once by holding down the Control key while clicking with the mouse (or the Command key on a Mac). Click "Get field data" to obtain your results.

Important tip: If you entered Batch Download via a Hits List (i.e. you exported your hits from the Hits List to Batch Download), your data will be entered as FlyBase IDs (e.g. FBgn, FBal, etc.). If you want to see the associated symbols in your output, be sure to select the field "Symbol: symbol".
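The rule for mixed lists (the data type with more entries wins, and a tie goes to the type of the first entry) can be mirrored locally to predict which part of a list would be ignored. The sketch below is an illustration, not the tool's actual implementation. It assumes the list holds FlyBase IDs and classifies them by prefix (FBgn for genes, FBal for alleles). Entries with other prefixes are skipped, which is a choice made here. The IDs in the demo are placeholders.

```python
from collections import Counter

# Assumed mapping from FlyBase ID prefix to data type; extend as needed.
PREFIX_TO_TYPE = {"FBgn": "gene", "FBal": "allele"}


def recognized_type(ids):
    """Predict which data type a mixed list would be treated as.

    The most frequent type wins; on a tie, the tied type that appears
    first in the list wins. Returns None if no entry is classifiable.
    """
    types = [PREFIX_TO_TYPE[i[:4]] for i in ids if i[:4] in PREFIX_TO_TYPE]
    if not types:
        return None
    counts = Counter(types)
    top = max(counts.values())
    for data_type in types:
        if counts[data_type] == top:
            return data_type


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    print(recognized_type(["FBgn0000001", "FBal0000001", "FBal0000002"]))
    print(recognized_type(["FBgn0000001", "FBal0000001"]))
```

Splitting a mixed list by type and submitting each part separately avoids losing entries to this rule.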
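Output saved with the As Tab Separated File option can be loaded for further processing with Python's standard `csv` module. The sketch below returns each non-empty line as a list of column values. It does not assume a header row or any particular column order, since neither is described here, and the sample text in the demo is invented.

```python
import csv
import io


def read_tsv(text):
    """Split tab-separated text into rows, skipping empty lines."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # keep any quote characters literally
    )
    return [row for row in reader if row]


if __name__ == "__main__":
    # Invented sample; real output columns depend on the fields selected.
    sample = "id_1\tvalue_a\nid_2\tvalue_b\n"
    for row in read_tsv(sample):
        print(row)
```

Disabling quote handling is a deliberate choice in this sketch so that free-text fields containing quotation marks are not altered while reading.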